2,003 research outputs found
Searches for TeV-scale particles at the LHC using jet shapes
New particles at the TeV scale can decay hadronically into strongly
collimated jets, so the standard reconstruction methods based on
invariant masses of well-separated jets can fail. We discuss how to identify
such particles in pp collisions at the LHC using jet shapes, which help to
reduce the contribution of QCD-induced events. We focus on a rather generic
example, X to ttbar to hadrons, with X being a heavy particle, but the approach
is well suited for the reconstruction of other channels characterized by a
cascade decay of known states.
Comment: 14 pages, 6 figures
EffiTest: Efficient Delay Test and Statistical Prediction for Configuring Post-silicon Tunable Buffers
At nanometer manufacturing technology nodes, process variations significantly
affect circuit performance. To combat them, post-silicon clock tuning buffers
can be deployed to balance timing budgets of critical paths for each
individual chip after manufacturing. The challenge of this method is that path
delays must be measured for each chip to configure the tuning buffers
properly. Current methods for this delay measurement rely on path-wise
frequency stepping. This strategy, however, requires too much time on
expensive testers. In this paper, we propose an efficient delay test framework
(EffiTest) to solve the post-silicon testing problem by aligning path delays
using the already-existing tuning buffers in the circuit. In addition, we only
test representative paths; the delays of other paths are estimated by
statistical delay prediction. Experimental results demonstrate that the
proposed method can reduce the number of frequency stepping iterations by more
than 94% with only a slight yield loss.
Comment: ACM/IEEE Design Automation Conference (DAC), June 201
Principal component analysis - an efficient tool for variable stars diagnostics
We present two diagnostic methods based on ideas of Principal Component
Analysis and demonstrate their efficiency for sophisticated processing of
multicolour photometric observations of variable objects.
Comment: 8 pages, 4 figures. Published already
How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility
Recommendation systems are ubiquitous and impact many domains; they have the
potential to influence product consumption, individuals' perceptions of the
world, and life-altering decisions. These systems are often evaluated or
trained with data from users already exposed to algorithmic recommendations;
this creates a pernicious feedback loop. Using simulations, we demonstrate how
using data confounded in this way homogenizes user behavior without increasing
utility.
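The feedback loop described above can be reproduced in a few lines. The sketch below is a toy model with assumed parameters, not the paper's simulation: a popularity-based recommender trains on logs generated by its own past recommendations, so consumption concentrates on a few items and mean realized utility drops relative to users simply consuming their favourites.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items = 500, 20

# True, heterogeneous user utilities (unknown to the recommender).
true_util = rng.random((n_users, n_items))

def consume(recommended):
    """Users accept the recommended item if it is good enough for them,
    otherwise they fall back to their personal favourite."""
    accept = true_util[np.arange(n_users), recommended] > 0.5
    return np.where(accept, recommended, np.argmax(true_util, axis=1))

# Organic behaviour: no recommender, everyone consumes their favourite item.
organic = np.argmax(true_util, axis=1)

# Confounded loop: the recommender pushes whatever is most popular in its
# own logs, and those logs are fed by its previous recommendations.
counts = np.ones(n_items)
for _ in range(10):
    choices = consume(np.full(n_users, np.argmax(counts)))
    counts += np.bincount(choices, minlength=n_items)

def top_share(choices):
    """Fraction of users consuming the single most-consumed item."""
    return np.bincount(choices, minlength=n_items).max() / n_users

print(top_share(organic), top_share(choices))         # homogeneity rises
print(true_util[np.arange(n_users), organic].mean(),
      true_util[np.arange(n_users), choices].mean())  # realized utility falls
```

The acceptance threshold and slate size are arbitrary; the qualitative effect (rising top-item share, falling mean utility) is what the abstract's simulations demonstrate at scale.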
Principal Component Analysis with Noisy and/or Missing Data
We present a method for performing Principal Component Analysis (PCA) on
noisy datasets with missing values. Estimates of the measurement error are used
to weight the input data such that compared to classic PCA, the resulting
eigenvectors are more sensitive to the true underlying signal variations rather
than being pulled by heteroskedastic measurement noise. Missing data is simply
the limiting case of weight = 0. The underlying algorithm is a noise-weighted
Expectation-Maximization (EM) PCA, which has the additional benefits of
implementation speed and flexibility for smoothing eigenvectors to reduce the
noise contribution. We present applications of this method to simulated data
and QSO spectra from the Sloan Digital Sky Survey.
Comment: Accepted for publication in PASP; v2 with minor updates, mostly to
bibliography
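A minimal one-component version of such a noise-weighted EM PCA can be sketched as follows. The function and toy data are illustrative assumptions, not the paper's code; missing entries simply enter with weight 0.

```python
import numpy as np

def empca_first_eigvec(X, W, n_iter=50):
    """First principal vector of X (n_samples x n_features) under
    per-element inverse-variance weights W; W == 0 marks missing data."""
    n, d = X.shape
    phi = np.random.default_rng(1).standard_normal(d)
    phi /= np.linalg.norm(phi)
    for _ in range(n_iter):
        # E-step: best per-sample coefficient under that sample's weights.
        c = (W * X) @ phi / np.maximum((W * phi**2).sum(axis=1), 1e-12)
        # M-step: refit each feature of phi from the weighted samples.
        phi = (W * X * c[:, None]).sum(axis=0) / np.maximum(
            (W * c[:, None]**2).sum(axis=0), 1e-12)
        phi /= np.linalg.norm(phi)
    return phi

# Toy data: rank-1 signal + heteroskedastic noise + 10% missing values.
rng = np.random.default_rng(0)
true = np.array([3.0, 2.0, 1.0, 0.5]); true /= np.linalg.norm(true)
coeff = rng.standard_normal(300)
sigma = rng.uniform(0.05, 0.3, size=(300, 4))
X = np.outer(coeff, true) + sigma * rng.standard_normal((300, 4))
W = 1.0 / sigma**2
W[rng.random((300, 4)) < 0.1] = 0.0   # missing entries: weight = 0
phi = empca_first_eigvec(X, W)
print(abs(phi @ true))                # aligns with the true direction
```

Because each sample carries its own weights, noisy features are down-weighted in both steps, which is what keeps the recovered eigenvector from being pulled by the heteroskedastic noise.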
Data Mining to Uncover Heterogeneous Water Use Behaviors From Smart Meter Data
Knowledge on the determinants and patterns of water demand for different
consumers supports the design of customized demand management strategies. Smart
meters coupled with big data analytics tools create a unique opportunity to
support such strategies. Yet, at present, the information content of smart
meter data is not fully mined and usually needs to be complemented with water
fixture inventory and survey data to achieve detailed customer segmentation
based on end use water usage. In this paper, we developed a data-driven
approach that extracts information on heterogeneous water end use routines,
main end use components, and temporal characteristics, only via data mining
existing smart meter readings at the scale of individual households. We tested
our approach on data from 327 households in Australia, each monitored with
smart meters logging water use readings every 5 s. As part of the approach, we
first disaggregated the household-level water use time series into different
end uses via Autoflow. We then adapted a customer segmentation based on
eigenbehavior analysis to discriminate among heterogeneous water end use
routines and identify clusters of consumers presenting similar routines.
Results revealed three main water end use profile clusters, each characterized
by a primary end use: shower, clothes washing, and irrigation. Time-of-use and
intensity-of-use differences exist within each class, as well as different
characteristics of regularity and periodicity over time. Our customer
segmentation analysis approach provides utilities with a concise snapshot of
recurrent water use routines from smart meter data and can be used to support
customized demand management strategies.
TU Berlin, Open-Access-Mittel - 201
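The eigenbehavior-based segmentation step can be illustrated on synthetic daily profiles. The three routines, the 24-hour resolution, and the simple nearest-centroid grouping below are assumptions for illustration, not the paper's pipeline (which works on disaggregated 5-second data).

```python
import numpy as np

rng = np.random.default_rng(0)
n_homes, hours = 90, 24

# Synthetic daily use profiles: three latent routines (morning shower,
# daytime clothes washing, evening irrigation) plus noise.
routines = np.zeros((3, hours))
routines[0, 6:9] = 1.0     # shower-dominated households
routines[1, 10:14] = 1.0   # clothes-washing-dominated households
routines[2, 18:22] = 1.0   # irrigation-dominated households
labels = np.repeat(np.arange(3), n_homes // 3)
X = routines[labels] + 0.1 * rng.standard_normal((n_homes, hours))

# Eigenbehavior analysis: PCA on mean-centred profiles, then a simple
# nearest-centroid grouping in the leading-component space.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T               # scores on the two leading eigenbehaviors

# k-means-style refinement, seeded with one profile per routine.
centroids = Z[[0, n_homes // 3, 2 * n_homes // 3]]
for _ in range(10):
    assign = np.argmin(((Z[:, None] - centroids)**2).sum(-1), axis=1)
    centroids = np.array([Z[assign == k].mean(axis=0) for k in range(3)])
print(np.bincount(assign))      # three clusters of 30 households each
```

Projecting onto a few eigenbehaviors before clustering is what lets heterogeneous routines separate cleanly despite per-hour noise.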
Significance analysis and statistical mechanics: an application to clustering
This paper addresses the statistical significance of structures in random
data: Given a set of vectors and a measure of mutual similarity, how likely
does a subset of these vectors form a cluster with enhanced similarity among
its elements? The computation of this cluster p-value for randomly distributed
vectors is mapped onto a well-defined problem of statistical mechanics. We
solve this problem analytically, establishing a connection between the physics
of quenched disorder and multiple testing statistics in clustering and related
problems. In an application to gene expression data, we find a remarkable link
between the statistical significance of a cluster and the functional
relationships between its genes.
Comment: to appear in Phys. Rev. Lett.
High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections
We consider the problem of inferring the interactions between a set of N
binary variables from the knowledge of their frequencies and pairwise
correlations. The inference framework is based on the Hopfield model, a special
case of the Ising model where the interaction matrix is defined through a set
of patterns in the variable space, and is of rank much smaller than N. We show
that Maximum Likelihood inference is deeply related to Principal Component
Analysis when the amplitude of the pattern components, xi, is negligible
compared to N^1/2. Using techniques from statistical mechanics, we calculate
the corrections to the patterns to first order in xi/N^1/2. We stress that
it is important to generalize the Hopfield model and include both attractive
and repulsive patterns, to correctly infer networks with sparse and strong
interactions. We present a simple geometrical criterion to decide how many
attractive and repulsive patterns should be considered as a function of the
sampling noise. We moreover discuss how many sampled configurations are
required for a good inference, as a function of the system size, N, and of the
amplitude, xi. The inference approach is illustrated on synthetic and
biological data.
Comment: Physical Review E: Statistical, Nonlinear, and Soft Matter Physics
(2011), to appear
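For orientation, one common parametrization of the generalized Hopfield distribution with p attractive patterns xi^mu and q repulsive patterns xi-hat^nu (the explicit form is assumed here; it is not spelled out in the abstract) is:

```latex
P(\mathbf{s}) \;=\; \frac{1}{Z}\,
\exp\!\Big[\,\frac{1}{2N}\sum_{\mu=1}^{p}\Big(\sum_{i=1}^{N}\xi_i^{\mu} s_i\Big)^{2}
\;-\;\frac{1}{2N}\sum_{\nu=1}^{q}\Big(\sum_{i=1}^{N}\hat{\xi}_i^{\nu} s_i\Big)^{2}\Big],
\qquad s_i=\pm 1,
```

so the effective coupling matrix J_ij is a sum of outer products of the patterns and has rank p + q, much smaller than N, which is the low-rank structure exploited by the PCA connection.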
Mesoscopic Model for Free Energy Landscape Analysis of DNA sequences
A mesoscopic model which allows us to identify and quantify the strength of
binding sites in DNA sequences is proposed. The model is based on the
Peyrard-Bishop-Dauxois model for the DNA chain, coupled to a Brownian particle
that explores the sequence, interacting most strongly with open base pairs
of the DNA chain. We apply the model to promoter sequences of different
organisms. The free energy landscape obtained for these promoters shows a
complex structure that is strongly connected to their biological behavior. The
analysis method used is able to quantify free energy differences between sites
within genome sequences.
Comment: 7 pages, 5 figures, 1 table
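For reference, the Peyrard-Bishop-Dauxois Hamiltonian for the base-pair openings y_n is commonly written (D and a set the Morse on-site potential, K, rho and alpha the anharmonic stacking; the coupling to the Brownian probe particle is this paper's addition and is not shown):

```latex
H \;=\; \sum_{n}\Big[\frac{p_n^{2}}{2m}
\;+\; D\big(e^{-a y_n}-1\big)^{2}
\;+\; \frac{K}{2}\big(1+\rho\, e^{-\alpha (y_n+y_{n-1})}\big)(y_n-y_{n-1})^{2}\Big],
```

where the entropic softening of the stacking term is what produces the cooperative openings ("bubbles") probed at binding sites.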
Nonequilibrium Invariant Measure under Heat Flow
We provide an explicit representation of the nonequilibrium invariant measure
for a chain of harmonic oscillators with conservative noise in the presence of
stationary heat flow. By first determining the covariance matrix, we are able
to express the measure as the product of Gaussian distributions aligned along
some collective modes that are spatially localized with power-law tails.
Numerical studies show that such a representation applies also to a purely
deterministic model, the quartic Fermi-Pasta-Ulam chain.